ABSTRACT

We present a dependently typed assembly language (DTAL) 
in which the type system supports the use of a restricted 
form of dependent types, reaping some benefits of dependent 
types at the assembly level. DTAL improves upon TAL, 
enabling certain important compiler optimizations such as 
run-time array bound check elimination and tag check elim¬ 
ination. Also, DTAL formally addresses the issue of rep¬ 
resenting sum types at assembly level, making it suitable 
for handling not only datatypes in ML but also dependent 
datatypes in Dependent ML (DML). 

1. INTRODUCTION 

A compiler for a realistic programming language is often 
large and complex. Though it is highly desirable to establish 
the correctness of such a compiler, there seems no effective 
approach to reaching this goal currently. Instead, the
ongoing research on certifying compilers attempts to partially
address this problem from a different angle. 

Suppose we have a compiler that translates a source program
e into target code |e|; if e possesses some property P
(e.g. e is terminating) that we know |e| must also possess if 
the compiler is implemented correctly, we can then design 
the compiler to produce a verifiable certificate asserting that 
|e| possesses the property P; if the certificate is successfully 
verified, our confidence in the compiler is raised; otherwise, 
a compiler error needs to be located and then fixed. 

In DML [18, 13], a functional programming language that
supports the use of a restricted form of dependent types, a
well-typed program is both type safe (which excludes, for
example, programs that attempt to add an integer to a
floating point number) and memory safe (which excludes stray
memory accesses). If we compile a well-typed program in
DML into some target code at assembly level, the target
code should also be both type safe and memory safe.
Obviously, the immediate question is how both type safety and
memory safety can be captured at assembly level. In this
paper, we address this question by designing a dependently
typed assembly language in which the dependent types can
capture both type safety and memory safety.

Specific approaches to certification include proof-carrying
code (PCC) (adopted in Touchstone [8]) and type systems
(adopted in TIL [11]). In PCC, both type safety and memory
safety are expressed by (first-order) logic assertions about
program variables and are checked by a verification condition
generator and a theorem prover, and code is then certified
by an explicit representation of the proof. In TIL, type
safety is expressed by type annotations and is checked by
a type checker and no additional certification is required.
The Touchstone approach draws on established results for
verification of first-order imperative programs. The TIL
approach draws on established methods for designing and
implementing type systems, making it unclear (a priori) that
it can be extended to low-level languages or to account for
memory safety.

Typed Assembly Language (TAL) [7] was introduced by Morrisett
et al., who design a type system at assembly level
suitable for compiling functional languages and give a
compilation from System F to TAL. TAL provides both
type safety and memory safety, but at the cost of making
critical instructions such as array subscripting atomic to
ensure memory safety. For instance, each array subscripting
instruction in TAL involves checking whether a given array
index is between the lower and upper bounds of the array
before fetching the data item.

We enrich TAL to allow for more fine-grained control over
memory safety so as to support array bound check elimination,
hoisting bound checks out of loops, efficient representation
of sum types, etc. We draw on the formalism of dependent
types to extend TAL with such a concept. However,
we cannot rely directly on standard systems of dependent
types [4] for languages with computational effects. For
instance, it is entirely unclear what it means to say that A
is an array of length x for some mutable variable x: if we
update x with a different value, this changes the type of
A but A itself is unchanged! Drawing on our experience
with a restricted form of dependent types in DML [18], we
introduce a clear separation between ordinary run-time
expressions and a distinguished family of index expressions,
linked by singleton types of form int(x): every integer
expression of type int(x) must have value equal to x. The
index expressions are chosen from an integer domain in this
paper. Given an expression e (in DML), checking whether
e has type int(x) (written as e : int(x)) involves non-trivial
equational reasoning about the run-time behavior of e. For





{m:nat, n:nat | m <= n}
void copy(int src[m], int dst[n]) {
  var: int i, length;;
  length = arraysize(src);
  for (i = 0; i < length; i = i + 1) {
    dst[i] = src[i];
  }
}

Figure 1: A copy function in Xanadu 


instance, e : int(3) means that e, when evaluated, must
evaluate to 3. Clearly, 3 : int(3), and perhaps, 1 + 2 : int(3), but
it is, in general, undecidable whether an arbitrary (possibly
effectful) e has type int(3). This is where theorem proving
and constraint satisfaction come into the picture.

It is difficult to read assembly code. In the following
presentation, we will occasionally use programs in Xanadu [15],
a dependently typed imperative programming language
with C-like syntax, to facilitate the presentation of DTAL.
We could also use programs in DML for this purpose but the
great difference between DML and DTAL would make this
alternative less desirable. The Xanadu program in Figure 1
implements a copy function on arrays. The function header
in the program states that for all natural numbers m and
n satisfying m ≤ n the function takes two integer arrays of
sizes m and n, respectively, and returns no value. Note that
{m:nat, n:nat | m <= n} is a universal quantifier and
int src[m] and int dst[n]
mean that src and dst are integer arrays of sizes m and
n, respectively. We use var: to start a variable declaration,
which ends with ;;. Furthermore, the function arraysize
returns the size of an array. Note that the type index m is
not available at run-time and we use arraysize here to get
an integer equal to m (or literally, an integer of type int(m)).

The DTAL code in Figure 2 corresponds to the Xanadu
program. Note that r1, ..., r5 are registers. The instruction
arraysize r3, r1 is non-standard; it means that
we store into r3 the size of the array to which r1 points.
The branch instruction bgte r5, finish jumps to the label
finish if the integer in r5 is greater than or equal to
zero. Also load r5, r1(r4) means that we store into r5
the content of the ith cell in the array to which r1 points,
where i is the integer stored in r4. The store instruction is
interpreted similarly.

Every label in the code is associated with a dependent
type. The dependent type associated with the label loop
basically means that there exist a natural number m and a
natural number n satisfying m ≤ n and a natural number i
such that r1, r2, r3, r4 are of types int array(m), int
array(n), int(m), int(i), respectively, that is, they are
an integer array of size m, an integer array of size n, an
integer of value m and an integer of value i. This enables
us to state, for instance, that the type of r1 depends on the
value in r3. The type system of DTAL guarantees that these
properties are satisfied when the code execution reaches the
label loop.

The DTAL code is well-typed, which guarantees that the
integer in r4 is always a natural number and its value is always
less than the size of the array to which r1 (r2) points
when the load (store) instruction is executed.¹ In other
words, it can be statically verified that there is no need
for run-time array bound checking in this case. Although
this is a very simple example, it is nonetheless impossible
to infer that the store instruction is safe without the dependent
type associated with the label loop. In DTAL, array
access is separated from array bound checks and the type
system of DTAL guarantees that the execution of well-typed
DTAL code can never perform out-of-bounds array access. It is
this separation that makes array bound check elimination
possible. In the case where it is impossible to prove in the
type system of DTAL that an array access is within bounds,
run-time array bound checks can be inserted to
ensure safety.

We also address in DTAL the issue of representing sum 
types at assembly level. Furthermore, we demonstrate how 
dependent datatypes in DML can be translated into DTAL, 
allowing, for instance, an implementation of the list reverse 
function in DTAL that uses the type system of DTAL to 
guarantee this function to be length-preserving. 

In a realistic setting, machine-level arithmetic is often
modulo a power of 2, say, 2^32. This can be readily handled
in our framework. For instance, we can assign the following
type to + for handling (unsigned) addition modulo 2^32,
where int32 is the sort {a : int | 0 ≤ a < 2^32}.

Πi : int32. Πj : int32. int(i) * int(j) → int((i + j) mod 2^32)

The reason that we do not treat modular arithmetic in this
paper is merely to keep the presentation less involved.
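
The intended reading of this type can be illustrated with a small Python sketch. The names in_int32 and add32 are ours, chosen for illustration only; they are not part of DTAL. The sketch merely checks at run time the property that the type above would establish statically: given two values of sort int32, the result is (i + j) mod 2^32 and again lies in int32.

```python
# Illustrative sketch only: `in_int32` and `add32` are hypothetical names,
# not instructions or primitives of DTAL.
MODULUS = 2 ** 32


def in_int32(a: int) -> bool:
    """Membership in the sort int32 = {a : int | 0 <= a < 2^32}."""
    return 0 <= a < MODULUS


def add32(i: int, j: int) -> int:
    """Unsigned addition modulo 2^32 on two values of sort int32."""
    assert in_int32(i) and in_int32(j), "arguments must have sort int32"
    result = (i + j) % MODULUS
    # The index of the result type is (i + j) mod 2^32, which is again in int32.
    assert in_int32(result)
    return result
```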

The main contribution of the paper is a formulation of a
dependent type system for a language at assembly level that
(a) is non-trivial for reasons outlined previously, (b)
generalizes TAL to allow for capturing significant loop-based
optimizations, (c) yields an application of dependent types to
managing low-level representation of sum types, setting up
some machinery needed for compiling dependent datatypes
supported in DML into assembly level, and (d) provides an
approach to certification based on type-checking. One
trade-off is that we presume that the constraint solver is part of
the trusted computing base in order for the recipient to verify
the code it receives. Future work might include some means
of formally representing proofs of constraints so that the
constraint solver can be moved out of the trusted computing
base.

Also, the advantages and disadvantages of using a DTAL-like
language as the target language of a compiler remain to be
studied. When compared with the work in DML
and Xanadu, novelties in DTAL include:

• Datatype representation at assembly level. For instance,
assume that a function in DML is given the
type Πn : nat. (α)list(n) → (α)list(n), that is, it is
length preserving; how can such a property be translated
into low-level code?

• Control flow at assembly level that involves dependent
types. There are simply no jumps, conditional or
unconditional, in either DML or Xanadu, but we have to
deal with such language features in DTAL.

In general, the design of DML and Xanadu is more
concerned with type inference while the design of DTAL is more

¹ This point should become clear if one reasons about
instructions 4 and 5 in the code.





00. copy: {m:nat, n:nat | m <= n} [r1: int array(m), r2: int array(n)]
01.   arraysize r3, r1    // obtain the size of source array
02.   mov r4, 0           // initialize the loop count to 0
03. loop: {m:nat, n:nat | m <= n, i:nat}
          [r1: int array(m), r2: int array(n), r3: int(m), r4: int(i)]
04.   sub r5, r4, r3      // r5 <- r4 - r3
05.   bgte r5, finish     // r4 >= r3
06.   load r5, r1(r4)     // safe load
07.   store r2(r4), r5    // safe store
08.   add r4, r4, 1       // increase the count by 1
09.   jmp loop            // loop again
10. finish: []
11.   halt                // it can also return to the caller if needed

Figure 2: A copy function implemented in DTAL 


concerned with type checking as the types in DTAL are to
be generated by a compiler. For instance, some of the typing
rules in DTAL are not syntax directed, and annotations
may need to be generated by a compiler in DTAL code to 
direct type-checking. We consider this to be a crucial point 
in the design of DTAL. 

We organize the paper as follows. The syntax of DTAL
is given in Section 2. We then form evaluation and typing
rules so as to assign dynamic and static semantics to
DTAL, respectively. We, however, postpone until Section 3
the treatment of constraints, which are generated during
type-checking programs in DTAL. In Section 4, we give a
detailed example explaining how type-checking is performed in
DTAL. The soundness of the type system of DTAL is stated
in Section 5 and an extension of DTAL to handle sum types
is given in Section 6. We then in Section 7 mention a
type-checker for DTAL and a compiler which compiles Xanadu, a
language resembling Safe C [9] and Popcorn [6] with C-like
syntax, into DTAL. The rest of the paper discusses some
closely related work and future directions.

2. DTAL 

In this section we present a dependently typed assembly
language (DTAL), forming both dynamic and static semantics
for DTAL.

2.1 Syntax 

We assume that there are a fixed number n_r of registers.
A register file R is a finite mapping from the set
{0, 1, ..., n_r − 1} into types. The intent is to capture some
type information on registers with R. The syntax for DTAL
is given in Figure 3. Note that stacks, which are treated in
[16], are omitted here for simplicity, though we do use stacks
in some code examples. One may simply think of a stack as
an infinite list of registers. Also, we omit tuples, which can
be handled as in TAL.

Intuitively speaking, dependent types are types which
depend on the values of language expressions. For instance,
we may form a type (int)array(x) to mean that every heap
pointer of this type points to an integer array of size x,
where x is the expression on which this type depends. We
use the name type index expression for such an expression.
We restrict type index expressions to an integer domain.
The justification for this choice is that we have used this
domain to eliminate array bound checks effectively [17].

We present the syntax for type index expressions in
Figure 4, where we use a to range over type index variables and
i for fixed integers. Note that the language for type index
expressions is typed. We use sorts for the types in this
language in order to avoid potential confusion. We use · for the
empty index context and omit the standard sorting rules for
this language. The subset sort {a : γ | P} stands for the sort
for those elements of sort γ satisfying the proposition P. For
example, we use nat as an abbreviation for {a : int | a ≥ 0}.

We postpone the treatment of constraint satisfaction in
this type index language until Section 3 for simplicity of
exposition. However, we informally explain the need for
constraints through the DTAL code in Figure 2. Notice that
register r4 is assumed to be of type int(i_1) for some natural
number i_1 when the execution reaches the label loop. The
type of r4 changes into int(i_1 + 1) after the execution of
the instruction add r4, r4, 1. Then the execution jumps
back to the label loop. This jump requires it to be verified
(among many other requirements) that r4 is of type int(i_2)
for some natural number i_2. Therefore, we need to prove
that i_1 + 1 is a natural number under the condition that i_1
is a natural number. This is a constraint, though it is trivial
in this case. In general, type-checking in DTAL involves
solving a great number of constraints of this form.

We use top for the type of uninitialized registers and assume
that a register is initialized if it is not of type top. A
block B = λΔ.λφ.(R, I) roughly means that B is polymorphic
in type variable context Δ and index variable context
φ. We may omit λΔ (λφ) if Δ (φ) is empty. In order to
execute the block on an abstract machine, we need to find
substitutions Θ and θ for Δ and φ, respectively, such that
the current machine state entails the state R[Θ][θ] and then
execute I[Θ][θ]. The entailment of R means that the type
assignment to registers in R correctly reflects the types of
registers in the current abstract machine. For instance, if R
indicates that an integer is in a register r, then an integer
must be stored in r in the abstract machine. A state type
state(λΔ.λφ.R), when associated with a label, means that
there are substitutions Θ and θ for Δ and φ, respectively,
such that the current abstract machine state entails R[Θ][θ]
whenever the execution reaches the label. The explanation
here assumes that we carry types around when we evaluate
DTAL code. Of course, we do not actually need to carry




type variables          α
state types             σ   ::= state(λΔ.λφ.R)
regfile types           R   ::= [r_0 : τ_0, ..., r_{n_r−1} : τ_{n_r−1}]
types                   τ   ::= α | σ | top | unit | int(x) | τ array(x) | ∃φ.τ
type erasures           ε   ::= α | top | unit | int | ε array
type variable contexts  Δ   ::= ·_tv | Δ, α
registers               r   ::= r_0, ..., r_{n_r−1}
instructions            ins ::= aop r_d, r_s, v | bop r, v | arraysize r_d, r_s |
                                mov r, v | load r_d, r_s(v) | store r_d(v), r_s |
                                newarray[τ] r, r′, r″ | jmp v | halt
fixed integers          i   ::= ... | −1 | 0 | 1 | ...
constants               c   ::= ⟨⟩ | i | l
values                  v   ::= c | r
instruction sequences   I   ::= jmp v | halt | ins; I
blocks                  B   ::= λΔ.λφ.(R, I)
arithmetic ops          aop ::= add | sub | mul | div
branch ops              bop ::= beq | bne | blt | blte | bgt | bgte
labels                  l
label mappings          Λ   ::= {l_1 : σ_1, ..., l_n : σ_n}
programs                P   ::= l_1 : B_1; ...; l_n : B_n


Figure 3: Syntax for DTAL 


index variables     a
index expressions   x, y ::= a | i | x + y | x − y | x * y | x ÷ y
index propositions  P    ::= x < y | x ≤ y | x = y | x ≥ y | x > y | ¬P | P_1 ∧ P_2 | P_1 ∨ P_2
index sorts         γ    ::= int | {a : γ | P}
index contexts      φ    ::= · | φ, a : γ | φ, P


Figure 4: Syntax for type index expressions 


P    = (copy : B_1, loop : B_2, finish : B_3)
Λ(P) = {copy : σ_1, loop : σ_2, finish : σ_3}
J(P) = copy; I_1; loop; I_2; finish; halt
B_1  = λ(m : nat, n : nat, m ≤ n).(R_1, I_1)
B_2  = λ(m : nat, n : nat, m ≤ n, i : nat).(R_2, I_2)
B_3  = (R_empty, halt)
σ_1  = state(λ(m : nat, n : nat, m ≤ n).R_1)
σ_2  = state(λ(m : nat, n : nat, m ≤ n, i : nat).R_2)
σ_3  = state(R_empty)

Figure 5: The representation of the program in Fig 2 


types around in practice when we evaluate DTAL code as it 
is clear types play no role in evaluation of DTAL code. This 
is precisely like the case where a well-typed ML program is 
evaluated. 

We use J for a general instruction sequence in the following
presentation, which consists of a sequence of instructions
or labels. Given a block B = λΔ.λφ.(R, I), we write σ(B)
for state(λΔ.λφ.R) and I(B) for I. Also we define functions
Λ and J on program P = l_1 : B_1; ...; l_n : B_n as follows.

Λ(P) = {l_1 : σ(B_1), ..., l_n : σ(B_n)}

J(P) = l_1; I(B_1); ...; l_n; I(B_n)

We refer to Λ(P) as the label mapping of P, in which we require
that all labels be distinct. For a valid program P,
all labels in J(P) must be declared in Λ(P). In all the


examples of DTAL code that we present in this paper, we
attach the state type σ of a label l to the label explicitly
in the program, and the label mapping of the program can
be immediately extracted from the code if necessary. We
explain these definitions in Figure 5, where the program P
is given in Figure 2; I_1 and I_2 are the sequences of instructions
between the labels copy and loop and those between
labels loop and finish, respectively. R_1 is a mapping which
maps 1 and 2 to (int)array(m) and (int)array(n), respectively,
and R_1(i) = top for i ≠ 1, 2; R_2 maps 1, 2, 3 and 4 to
(int)array(m), (int)array(n), int(m) and int(i), respectively,
and R_2(i) = top for i ≠ 1, 2, 3, 4; R_empty(i) = top for i in
all its domain. Note that we write int for ∃a : int.int(a),
that is, int is the sum of all singleton types int(a), where a
ranges over integers.

The following erasure function || • || transforms types into 
type erasures, that is, non-dependent types. 

‖top‖ = top    ‖unit‖ = unit    ‖α‖ = α    ‖int(x)‖ = int
‖σ‖ = unit    ‖τ array(x)‖ = ‖τ‖ array    ‖∃φ.τ‖ = ‖τ‖

It can be readily verified after the presentation of DTAL that 
DTAL becomes a TAL-like language if one erases all syntax 
related to type index expressions. In this TAL-like language, 
the erasure of a program is well-typed if it is well-typed in 
DTAL. In this respect, DTAL generalizes TAL. We stress 
the erasure property because it indicates that DTAL does 
not make more programs typable than TAL but, instead, 
can assign more accurate types to programs. 
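
To make the erasure function concrete, the following Python sketch implements it over a small, hypothetical encoding of DTAL types. The class names (TVar, Top, Unit, Int, State, Array, Exists) and the string representation of type erasures are our own assumptions for illustration; only the clauses of the erasure function itself are taken from the definition above.

```python
from dataclasses import dataclass


# Hypothetical encoding of DTAL types; index expressions and index contexts
# are kept as plain strings because erasure discards them anyway.
@dataclass(frozen=True)
class TVar:
    name: str


@dataclass(frozen=True)
class Top:
    pass


@dataclass(frozen=True)
class Unit:
    pass


@dataclass(frozen=True)
class Int:
    index: str            # int(x)


@dataclass(frozen=True)
class State:
    regfile: tuple        # state type; its contents are ignored by erasure


@dataclass(frozen=True)
class Array:
    elem: object
    index: str            # tau array(x)


@dataclass(frozen=True)
class Exists:
    ctx: str
    body: object          # exists phi. tau


def erase(t) -> str:
    """Map a type to its erasure, following the clauses given in the text."""
    if isinstance(t, TVar):
        return t.name
    if isinstance(t, Top):
        return "top"
    if isinstance(t, Unit):
        return "unit"
    if isinstance(t, Int):
        return "int"
    if isinstance(t, State):
        return "unit"
    if isinstance(t, Array):
        return erase(t.elem) + " array"
    if isinstance(t, Exists):
        return erase(t.body)
    raise TypeError(f"not a DTAL type: {t!r}")
```

For example, erasing Array(Int("m"), "n") forgets both indices and yields the non-dependent type "int array".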










2.2 Dynamic Semantics 

We use an abstract machine for assigning operational
semantics to DTAL, which is a standard approach. A machine
state ℳ is a pair (ℋ, ℛ), where ℋ and ℛ are finite mappings
which stand for heap and register file, respectively.

The domain dom(ℋ) of ℋ is a set of heap addresses, and the
domain dom(ℛ) of ℛ is {0, ..., n_r − 1}. We do not specify
how a heap address is represented, but the reader can simply
assume it to be a natural number. Given h ∈ dom(ℋ), ℋ(h)
is a tuple (hc_0, ..., hc_{n−1}) such that for i = 0, ..., n − 1,
every hc_i is either a heap address or a constant. Given
i ∈ dom(ℛ), ℛ(i) is either a heap address or a constant.

Given a program P, Λ = Λ(P) associates every label in
J = J(P) with a state type σ. We use length(J) for the
length of the sequence J, counting both instructions and
labels. We use J(i) for the ith item in J, which is either
an instruction or a label. Also we write J⁻¹(l) for i if l
is J(i). This is well-defined since all labels in a program
are distinct. We define a P-snapshot Q as either HALT or
a pair (ic, ℳ) such that 0 ≤ ic < length(J). The relation
(ic, ℳ) →_P (ic′, ℳ′) means that the current machine state
ℳ transforms into ℳ′ after executing the instruction J(ic)
and the instruction counter is set to ic′.

Given ℳ = (ℋ, ℛ), we define the following.

ℳ(v) = ⟨⟩      if v is ⟨⟩;
       i       if v is integer i;
       l       if v is label l;
       ℛ(i)    if v is the ith register r_i.


Given a finite mapping f and an element x in the domain of
f, we use f(x) for the value to which f maps x, and f[x ↦ v]
for the mapping such that

f[x ↦ v](y) = v       if y = x;
              f(y)    otherwise.

Clearly, f[x ↦ v] is also meaningful when x is not already in
the domain of f. In this case, we simply extend the domain
of f with x.
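
As a minimal sketch, the update operation on finite mappings can be mirrored with Python dictionaries. The function name update is hypothetical; the sketch returns a new mapping rather than mutating its argument, which matches the mathematical reading of the notation.

```python
def update(f: dict, x, v) -> dict:
    """Return the mapping f[x -> v]; f itself is left unchanged.

    If x is not yet in the domain of f, the domain is extended with x.
    """
    g = dict(f)   # copy, so the original finite mapping is preserved
    g[x] = v
    return g
```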

We use the notation ℛ[r ↦ hc] to mean that we update
the content of register r with hc, that is, ℛ[r ↦ hc] is really
ℛ[i ↦ hc], where i is the numbering of register r. Also we
use ℳ[r ↦ hc] for (ℋ, ℛ[r ↦ hc]) given ℳ = (ℋ, ℛ).

We present some evaluation rules for DTAL in Figure 6.
We do not consider garbage collection in this abstract machine,
and therefore the typing of the heap can only be affected
by the memory allocation instruction newarray. Notice
that the rules (eval-load) and (eval-store) imply that
an out-of-bounds array access stalls the abstract machine.
These rules also indicate that the length of the tuple ℋ(h)
can always be determined for every h ∈ dom(ℋ) at run-time.
We will soon design a type system for DTAL and prove
that 0 ≤ i < n in both rules (eval-load) and (eval-store)
always holds when these rules are applied during the
evaluation of a well-typed DTAL program. Therefore, there is
no need for determining the length of the tuple ℋ(h) for every
h ∈ dom(ℋ) if we only evaluate well-typed DTAL programs.
In the case where it cannot be determined in the type
system of DTAL whether a subscript is within the bounds
of an array, the array subscripting instruction is ill-typed
and thus rejected. This sounds like a severe restriction, but
it is not because we can always insert run-time array bound
checks to make the instruction typable in DTAL (we give
such an example at the end of Section 2.3).
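
The behavior described above can be explored with a small interpreter. The following Python sketch is our own approximation of the abstract machine for the instructions used in Figure 2; it is not the formal semantics. The tuple encoding of instructions, the names run, val and Stuck, and the step budget are all assumptions made for illustration. An out-of-bounds load or store raises Stuck, mirroring the statement that such an access stalls the machine.

```python
class Stuck(Exception):
    """Raised when no evaluation rule applies (the machine stalls)."""


def val(regs, v):
    # M(v): a register name denotes its content; a constant denotes itself.
    return regs[v] if isinstance(v, str) and v in regs else v


def run(J, heap, regs, max_steps=10_000):
    """Execute the general instruction sequence J on the state (heap, regs)."""
    labels = {item[1]: k for k, item in enumerate(J) if item[0] == "label"}
    ic = 0
    for _ in range(max_steps):
        op, *args = J[ic]
        if op == "halt":
            return heap, regs
        if op == "label":
            pass                                   # labels are skipped
        elif op == "mov":
            regs[args[0]] = val(regs, args[1])
        elif op in ("add", "sub"):
            rd, rs, v = args
            a, b = regs[rs], val(regs, v)
            regs[rd] = a + b if op == "add" else a - b
        elif op == "arraysize":
            regs[args[0]] = len(heap[regs[args[1]]])
        elif op == "load":                         # load rd, rs(v)
            rd, rs, v = args
            cells, i = heap[regs[rs]], val(regs, v)
            if not 0 <= i < len(cells):
                raise Stuck("out-of-bounds load")
            regs[rd] = cells[i]
        elif op == "store":                        # store rd(v), rs
            rd, v, rs = args
            cells, i = heap[regs[rd]], val(regs, v)
            if not 0 <= i < len(cells):
                raise Stuck("out-of-bounds store")
            cells[i] = regs[rs]
        elif op == "bgte":                         # branch if register >= 0
            if regs[args[0]] >= 0:
                ic = labels[args[1]]
                continue
        elif op == "jmp":
            ic = labels[args[0]]
            continue
        else:
            raise Stuck(f"unknown instruction {op}")
        ic += 1
    raise RuntimeError("step budget exhausted")


# The copy program of Figure 2 in the tuple encoding assumed above.
COPY = [
    ("label", "copy"),
    ("arraysize", "r3", "r1"),
    ("mov", "r4", 0),
    ("label", "loop"),
    ("sub", "r5", "r4", "r3"),
    ("bgte", "r5", "finish"),
    ("load", "r5", "r1", "r4"),
    ("store", "r2", "r4", "r5"),
    ("add", "r4", "r4", 1),
    ("jmp", "loop"),
    ("label", "finish"),
    ("halt",),
]


def copy_arrays(src, dst):
    """Run COPY with r1 and r2 pointing at fresh copies of src and dst."""
    heap = {100: list(src), 200: list(dst)}
    regs = {f"r{k}": None for k in range(1, 6)}    # None plays the role of top
    regs["r1"], regs["r2"] = 100, 200
    heap, _ = run(COPY, heap, regs)
    return heap[200]
```

When the precondition m ≤ n of the label copy holds, for instance copy_arrays([7, 8, 9], [0, 0, 0, 0]), the run halts normally. When it is violated, for instance copy_arrays([1, 2], [0]), the store instruction stalls the machine, which is exactly the situation the type system is designed to rule out statically.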


    φ; Δ; R ⊢_Λ ⟨⟩ : unit

    φ; Δ; R ⊢_Λ i : int(i)                        (type-int)

    Λ(l) = σ
    ------------------                            (type-label)
    φ; Δ; R ⊢_Λ l : σ

    0 ≤ i < n_r
    ------------------------                      (type-reg)
    φ; Δ; R ⊢_Λ r_i : R(i)

    φ; Δ; R ⊢_Λ v : τ_1    φ; Δ ⊨ τ_1 ≤ τ_2
    ----------------------------------------      (type-sub)
    φ; Δ; R ⊢_Λ v : τ_2

Figure 7: Typing rules for integers, labels, registers 


The rule (eval-newarray) is non-standard. If ‖τ‖ is
of form (ε)array, then newarray[τ] r, r′, r″ allocates n new
words of memory on the heap, where n is the integer stored in
r′, initializes each word with the content in r″ and then
stores in r a pointer to the allocated memory. We
emphasize that h must be new in the rule (eval-newarray),
that is, h is not already in the domain of ℋ. The typing
consequences of this memory allocation instruction are explained
in the next section, where the typing rule (type-newarray)
is introduced.

Let us call a program well-structured if its evaluation halts 
normally (when the rule (eval-halt) is applied) or continues 
forever. In other words, the evaluation of a well-structured 
program can never be stuck. Certainly it is undecidable to 
determine precisely whether a program is well-structured, 
but this is also less relevant. We intend to find a conservative
approach to examining whether a program is
well-structured. Such an approach must be sound, that is, it
can only accept well-structured programs. For instance, a 
straightforward approach is to adopt a method based on 
TAL for type-safety and then insert run-time checks for all 
array operations. Unfortunately, this approach seems too 
conservative, making it impossible to eliminate array bound 
checks. Notice that this is essentially the case in all JVML 
verifiers. In the next section, we present a less conservative 
approach based on a dependent type system. 

2.3 Static Semantics 

We present the typing rules for DTAL in this section. Note 
that we use an array representation for a register file R. We 
omit the standard rules for forming legal types and assume 
that all types are well-formed in the following presentation. 

We use a judgment of form φ; Δ; R ⊢_Λ v : τ to mean that
value v is assigned type τ under the context φ; Δ; R and
the label mapping Λ. The label mapping Λ is always fixed
when we type-check a program, and therefore we will omit
it if this causes no confusion. The rules in Figure 7 are for
typing unit, integers, labels and registers.

We present some typing rules for DTAL in Figure 8. We
use θ and Θ for index and type variable substitutions,
respectively, which are defined as usual. Given a term • such
as a type or a register file, we write •[Θ] (•[θ]) for the result
of applying Θ (θ) to •. A judgment of form φ; Δ; R ⊢ I
means that the instruction sequence I is well-typed under
context φ; Δ; R. The notation R[r : τ] means that we update
the type of register r to τ in R, that is, if r is the ith
register, then we update the value of R(i) with τ. We use




    ‖τ‖ = (ε)array    J(ic) = newarray[τ] r, r′, r″    ℳ(r′) = n ≥ 0    h ∉ dom(ℋ)
    --------------------------------------------------------------------------------  (eval-newarray)
    (ic, ℳ) →_P (ic + 1, ℳ[h ↦ (ℳ(r″), ..., ℳ(r″))][r ↦ h])

    J(ic) = load r_d, r_s(v)    ℋ(ℳ(r_s)) = (hc_0, ..., hc_{n−1})    ℳ(v) = i    0 ≤ i < n
    --------------------------------------------------------------------------------  (eval-load)
    …

    J(ic) = halt
    ------------                                                                      (eval-halt)
    …

Figure 6: Some evaluation rules for DTAL
    …                                                                        (type-newarray)

    …
    -------------------------------                                          (type-load-array)
    φ; Δ; R ⊢ load r_d, r_s(v); I

    φ; Δ; R ⊢ r_d : τ array(x)    φ; Δ; R ⊢ v : int(y)    φ ⊨ 0 ≤ y < x
    φ; Δ; R ⊢ r_s : τ    φ; Δ; R ⊢ I
    ---------------------------------------------------------------------    (type-store-array)
    φ; Δ; R ⊢ store r_d(v), r_s; I

    φ; Δ; R ⊢ v : state(λΔ′.λφ′.R′)    φ ⊢ θ : φ′    φ; Δ ⊢ Θ : Δ′
    φ; Δ; R ⊨ R′[Θ][θ]
    ---------------------------------------------------------------------    (type-jmp)
    φ; Δ; R ⊢ jmp v; I

    …    φ, x = 0 ⊢ θ : φ′    φ, x = 0; Δ ⊢ Θ : Δ′    φ, x = 0; Δ; R ⊨ R′[Θ][θ]
    ---------------------------------------------------------------------    (type-beq)
    φ; Δ; R ⊢ beq r, v; I


Figure 8: The typing rules for DTAL 


the rule (type-newarray) for typing arrays allocated on
the heap. We have explained in the previous section how memory
allocation is performed. Also we require that the index
variables declared in φ′ in the rule (type-open-reg) have
no free occurrence in the conclusion of the rule.

The typing rule (type-add) indicates that the type of register
r_d becomes int(x + y) after the instruction add r_d, r_s, v is
executed, where we assume that r_s and v have types int(x)
and int(y), respectively. If arithmetic overflow is to be
considered, we may require the instruction to be followed by
an instruction that traps overflow; if an overflow occurs, we
jump to a subroutine to handle it; otherwise, we know r_d
indeed has type int(x + y).

We give some explanation on the rule (type-beq). We
use φ ⊢ θ : φ′ to mean that θ is a substitution for φ′ under
φ, that is, for every a : γ declared in φ′, φ ⊢ θ(a) : γ is
derivable and for every P in φ′, φ ⊨ P[θ] is satisfiable.
The explanation for φ; Δ ⊢ Θ : Δ′ is similar. Suppose that
we type-check beq r, v; I under φ; Δ; R; we first check that
r has type int(x) for some x; we then type-check I under
φ, x ≠ 0; Δ; R (x ≠ 0 is added into φ since the jump is not
taken in this case); we also verify that v has a state type
and φ, x = 0; Δ; R entails the state type (x = 0 is added to
φ since the jump is taken in this case). The typing rules for
other conditional jumps are similar.

We sketch a case where a DTAL program that does not
type-check can be modified to type-check with the insertion
of a run-time array bound check. Assume that we want
to type-check load r_d, r_s(v); I under φ; Δ; R, and we have
verified that r_s and v have types τ array(x) and int(y),
respectively, and we can prove φ ⊨ 0 ≤ y but not φ ⊨ y < x;
we can then insert the following (where subscript is the entry
to some routine that handles errors) in front of the load
instruction, and this insertion guarantees that x − y > 0 is
already added to φ when the load instruction is type-checked,
making sure that y < x is provable.

arraysize r, r_s; sub r, r, v; blte r, subscript;

A dual case is to remove a redundant array bound check. For
instance, we want to type-check blt r, subscript; I under
φ; Δ; R; suppose that r has type int(x) for some x and φ ⊨
x ≥ 0 can be proven; this implies that blt r, subscript can
never branch and thus this instruction can be removed.

We use ⊢ P [well-typed] to mean that a program P = (l_1 :
B_1, ..., l_n : B_n) is well-typed and the following rule is for
typing a program, where Λ is the label mapping of P.

    ⊢_Λ B_1 [well-typed]  ···  ⊢_Λ B_n [well-typed]
    -----------------------------------------------
    ⊢ P [well-typed]

Given a block B = λΔ.λφ.(R, I), the rule for deriving ⊢_Λ
B [well-typed] is given as follows.

    φ; Δ; R ⊢_Λ I
    --------------------    (type-block)
    ⊢_Λ B [well-typed]


3. TYPE EQUALITY AND COERCION 

As we have mentioned before, a novelty in DML is the
separation between language expressions and type index
expressions. This notion of separation seems indispensable
when we intend to form a dependent type system for an
imperative language such as DTAL. For instance, it is
completely unclear at the moment how a register can be used as
a type index expression, since it is mutable. The separation
allows us to simply avoid such a problematic issue. Another
advantage is that the separation enables us to choose a
relatively simple domain for type index expressions so that
constraints (on type index expressions) generated during
type-checking can be efficiently solved. This is crucial to
the design of a practical type-checking algorithm. In this
section, we present type equality and coercion, which lead
to constraint generation in type-checking.

In the presence of dependent types, it is no longer trivial
to check whether two types are equivalent. For instance,
we have to prove the constraint 1 + 1 = 2 in order to
claim that int(1 + 1) is equivalent to int(2). In other words,
type equality is modulo constraint satisfaction. Similarly, type
coercion also involves constraint satisfaction.
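
As a rough illustration of equality modulo constraints (not the
authors' implementation), the following Python sketch treats an index
expression as a linear form and decides whether int(e1) and int(e2)
are equal in the empty context by comparing normal forms. The names
`lin`, `add`, `normalize` and `int_types_equal` are hypothetical; once
the context carries hypotheses, a constraint solver is needed instead.

```python
from collections import Counter

def lin(const=0, **coeffs):
    """Build a linear index expression: const + sum(coeff * var)."""
    return (const, Counter(coeffs))

def add(e1, e2):
    """Add two linear expressions term by term."""
    coeffs = Counter(e1[1])
    coeffs.update(e2[1])  # Counter.update adds coefficients
    return (e1[0] + e2[0], coeffs)

def normalize(e):
    """Drop zero coefficients so equal expressions share one normal form."""
    const, coeffs = e
    return (const, {v: k for v, k in coeffs.items() if k != 0})

def int_types_equal(e1, e2):
    """int(e1) equals int(e2) in the empty context iff e1 = e2 is valid."""
    return normalize(e1) == normalize(e2)

# int(1 + 1) versus int(2), and int(a + 1) versus int(1 + a)
same_constant = int_types_equal(add(lin(1), lin(1)), lin(2))
same_linear = int_types_equal(add(lin(a=1), lin(1)), add(lin(1), lin(a=1)))
```

Two linear expressions agree on all integer assignments exactly when
their constants and coefficients agree, which is why comparing normal
forms suffices in this restricted setting.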

We use $\Phi$ for index constraints, 


For instance, the following derivation shows that the type
$\exists a : nat.int(a)$ coerces into the type $\exists b : int.int(b)$,
where the top applied rule is (coerce-exi-ivar-r) and the other is
(coerce-exi-ivar-l).

$$\cfrac{\cfrac{a : nat \vdash a : int \qquad a : nat; \cdot \models int(a) \le int(a)}{a : nat; \cdot \models int(a) \le \exists b : int.int(b)}}{\cdot; \cdot \models \exists a : nat.int(a) \le \exists b : int.int(b)}$$

We have so far finished the presentation of the type system 
of DTAL, which is rather involved. We will use a concrete 
example in the next section to provide some explanation on 
type-checking before proceeding to establish the soundness 
of the type system. 

4. AN EXAMPLE 

We demonstrate some key steps involved in type-checking
the DTAL code in Figure 2. We stick to the notations
given in Figure 5. Let $ins_i$ be the $i$th instruction and $I_{2,i}$
be $ins_i; \ldots; ins_9$ for $4 \le i \le 9$. In order to derive
$\vdash_\Lambda B_2\ [\text{well-typed}]$, that is, to type block $B_2$,
we need to derive the following.

$m : nat, n : nat, m < n, i : nat; \cdot; R_2 \vdash I_2$

Then there must be derivations $\mathcal{D}_i$ with a conclusion of form
$\phi_i; \Delta_i; R_i \vdash I_{2,i}$ for $i = 4, \ldots, 9$. We list these
contexts $\phi_i; \Delta_i; R_i$ in Figure 10. In the derivation of
$\phi_6; \Delta_6; R_6 \vdash I_{2,6}$, the last rule is (type-load-array),
where we need to prove $\phi_6 \models 0 \le i < m$. This is trivial since
$i : nat$ and $i < m$ are assumed in $\phi_6$. Similarly, we need to prove
$\phi_7 \models 0 \le i < n$ when deriving $\phi_7; \Delta_7; R_7 \vdash I_{2,7}$.
This is also trivial since $m < n$, $i : nat$, $i < m$ are assumed in $\phi_7$.

5. SOUNDNESS 


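Index constraints $\Phi$ are given by

$\Phi ::= P \mid P \supset \Phi \mid \forall a : int.\Phi$

and we write $\phi \models \Phi$ for a satisfiability relation, stating
that $(\phi)\Phi$ is satisfiable in the domain of integers, where
$(\phi)\Phi$ is defined below.

$(\cdot)\Phi = \Phi$
$(\phi, a : int)\Phi = (\phi)\forall a : int.\Phi$
$(\phi, a : \{a : \gamma \mid P\})\Phi = (\phi, a : \gamma)(P \supset \Phi)$
$(\phi, P)\Phi = (\phi)(P \supset \Phi)$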

For instance, the satisfiability relation
$a : nat, b : int, a + 1 = b \models b \ge 0$ holds since the following
formula is true in the integer domain.

$\forall a : int.\ a \ge 0 \supset \forall b : int.\ a + 1 = b \supset b \ge 0$

We currently only accept linear constraints, using linear
integer programming to solve them. Though constraint
satisfaction is NP-complete, most constraints in practice are
efficiently solved.
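
To make the reading of a context as a quantified formula concrete,
here is a small Python sketch. It is only a bounded search for
counterexamples, not the linear integer programming procedure
mentioned above; the function `entails`, the helper `nat`, and the
tuple encoding of contexts are assumptions made for this illustration.

```python
from itertools import product

def nat(name):
    """Context entry a : nat, i.e. a : {a : int | a >= 0}."""
    return ("var", name, lambda env: env[name] >= 0)

def entails(context, goal, bound=8):
    """Check context |= goal over index values in [-bound, bound].

    context is an ordered list of ("var", name, predicate_or_None) and
    ("hyp", predicate) entries; predicates take an environment dict.
    A True result only means no counterexample exists within the bound.
    """
    names = [entry[1] for entry in context if entry[0] == "var"]
    for values in product(range(-bound, bound + 1), repeat=len(names)):
        env = dict(zip(names, values))
        hypotheses_hold = True
        for entry in context:
            pred = entry[2] if entry[0] == "var" else entry[1]
            if pred is not None and not pred(env):
                hypotheses_hold = False
                break
        if hypotheses_hold and not goal(env):
            return False  # counterexample found
    return True

# a : nat, b : int, a + 1 = b  |=  b >= 0
context = [nat("a"), ("var", "b", None),
           ("hyp", lambda env: env["a"] + 1 == env["b"])]
holds = entails(context, lambda env: env["b"] >= 0)
```

Each subset sort and each hypothesis becomes the antecedent of an
implication, so the check amounts to "for all assignments, if every
hypothesis holds then the goal holds".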

We write $\phi; \Delta \models \tau_1 = \tau_2$ to mean that types $\tau_1$
and $\tau_2$ are equal under context $\phi; \Delta$. Similarly, we write
$\phi; \Delta \models \tau_1 \le \tau_2$ to mean that type $\tau_1$ coerces
into type $\tau_2$ under context $\phi; \Delta$. Note that type coercion
can simply be viewed as a form of subtyping here. Some rules for type
coercion are presented in Figure 9. Notice that for the rule
(coerce-exi-ivar-l), there is an obvious side condition requiring that
the type $\tau_2$ does not contain free occurrences of the index
variables declared in $\phi'$.

The rules for type equality are similar and thus omitted.


By the type soundness of DTAL, we essentially mean that
the evaluation of well-typed DTAL code either halts
normally (when the instruction halt is executed) or goes on
indefinitely. The main ingredient in the proof of the type
soundness of DTAL is an entailment relation, for which we
present a brief explanation.

Given a program $P$, we use $J$ for the list consisting of
labels and instructions in $P$ and $J[ic]$ for the suffix of $J$
starting with the $ic$th item in $J$. Assume $\phi; \Delta; R \vdash J[ic]$
is derivable and there are substitutions $\theta$ and $\Theta$ for $\phi$
and $\Delta$, respectively, such that $\mathcal{M} \models R[\Theta][\theta]$
holds, that is, $\mathcal{M}$ entails $R[\Theta][\theta]$. We use
$\mathcal{H} \models hc : \tau$ to mean that $hc$ has type $\tau$ under the
heap mapping $\mathcal{H}$. For instance, we have
$\mathcal{H} \models i : int(i)$. The following rule (heap-array) is for
assigning array types.

$$\frac{\mathcal{H}(h) = (hc_0, \ldots, hc_{n-1}) \qquad \mathcal{H} \models hc_0 : \tau \quad \cdots \quad \mathcal{H} \models hc_{n-1} : \tau}{\mathcal{H} \models h : (\tau)array(n)}$$

We write $(\mathcal{H}, \mathcal{R}) \models R$, that is,
$(\mathcal{H}, \mathcal{R})$ entails $R$, if
$\mathcal{H} \models \mathcal{R}(i) : R(i)$ holds for every $i \in dom(R)$.
In other words, $(\mathcal{H}, \mathcal{R}) \models R$ means that the
content in each register does have the type assigned by $R$.

We state the type soundness theorem for DTAL below. 

Theorem 5.1. Let $P = (l_1 : B_1; \ldots; l_n : B_n)$ be a program
and $\Lambda = \Lambda(P)$. Assume $\vdash P\ [\text{well-typed}]$ is
derivable and $\Lambda(l_1) = R_{empty}$, where $R_{empty}$ maps each
register to type top. For every machine state $\mathcal{M}_0$, if
$(0, \mathcal{M}_0) \to_P^*$







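$$\frac{}{\phi; \Delta \models \tau \le top}\ \text{(coerce-top)} \qquad \frac{\alpha \in \Delta}{\phi; \Delta \models \alpha \le \alpha}\ \text{(coerce-type-var)} \qquad \frac{\phi \models x = y}{\phi; \Delta \models int(x) \le int(y)}\ \text{(coerce-int)}$$

$$\frac{\phi, \phi'; \Delta \models \tau_1 \le \tau_2}{\phi; \Delta \models \exists\phi'.\tau_1 \le \tau_2}\ \text{(coerce-exi-ivar-l)} \qquad \frac{\phi \vdash \theta : \phi' \qquad \phi; \Delta \models \tau_1 \le \tau_2[\theta]}{\phi; \Delta \models \tau_1 \le \exists\phi'.\tau_2}\ \text{(coerce-exi-ivar-r)}$$

$$\frac{\phi; \Delta \models \tau_1 = \tau_2 \qquad \phi \models x = y}{\phi; \Delta \models (\tau_1)array(x) \le (\tau_2)array(y)}\ \text{(coerce-array)} \qquad \frac{\phi; \Delta \models R(i) \le R'(i) \text{ for } 0 \le i < n_r}{\phi; \Delta \models R \le R'}\ \text{(coerce-reg)}$$

Figure 9: Some type coercion rules for DTAL

Figure 10: Contexts $\phi_k; \Delta_k; R_k$ for $k = 4, \ldots, 9$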


$(ic, \mathcal{M})$ then either $(ic, \mathcal{M}) \to_P \text{HALT}$, or
$(ic, \mathcal{M}) \to_P (ic', \mathcal{M}')$ for some $ic'$ and
$\mathcal{M}'$. In other words, the execution of a
well-typed program in DTAL either halts normally or runs
forever.

The proof of this theorem is involved. We have to deal with 
a subtle issue involving shared pointers and impose some 
regularity condition on the heap mapping TL in a machine 
state in order to establish the result. We give some brief 
explanation on this issue. 

Suppose $\mathcal{H}(h) = (0)$ for some $h$, $\mathcal{R}(0) = \mathcal{R}(1) = h$,
$R(0) = (int)array(1)$ and $R(1) = (nat)array(1)$, where we write nat
for $\exists a : nat.int(a)$. We can now derive
$(\mathcal{H}, \mathcal{R}) \models R$ since (0) can be viewed as both an
integer array of size 1 and a natural number array of size 1. Clearly,
if we store a negative integer into the array pointed to by $r_1$, then
the type of $r_2$ is invalidated because it no longer points to a
natural number array.

Assume $\phi; \Delta; R \vdash J[ic]$ is derivable, $\mathcal{M}$ entails
$R[\Theta][\theta]$ for some $\Theta$ and $\theta$, and
$(\mathcal{M}, ic) \to_P (\mathcal{M}', ic')$; what we essentially
need to prove is that $\phi'; \Delta'; R' \vdash J[ic']$ is derivable for
some $\phi'$ and $\Delta'$ such that $\mathcal{M}'$ entails
$R'[\Theta'][\theta']$. Unfortunately, the above example shows that this
is not provable as it is simply false. In order to overcome the
problem, we impose a regularity condition on the derivation of
$\mathcal{M} \models R$. Roughly speaking, we associate type $\tau$ with
heap address $h$ whenever the rule (heap-array) is applied, and a
derivation is regular if a heap address is associated with at most one
type. This notion of regularity is essentially the same as the notion
of store typing in [2], which was used to address the circularity
of references in ML. Clearly, there is no regular derivation
for the above example: in order to derive $\mathcal{M} \models R$, we have
to associate $h$ with at least two distinct types, int (when we
derive $\mathcal{H} \models \mathcal{R}(0) : R(0)$) and nat (when we derive
$\mathcal{H} \models \mathcal{R}(1) : R(1)$).
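
The shared-pointer example can be replayed with a small Python sketch.
It is a simplified model under our own assumptions, not the formal
system: types are tuples, heap addresses are strings, and the functions
`value_has_type` and `registers_well_typed` are hypothetical names. The
optional heap typing records one element type per address, which is
how the sketch models a regular derivation.

```python
def value_has_type(heap, value, ty, heap_typing=None):
    """Check H |= value : ty for types ("int",), ("nat",), ("array", elem, n).

    If heap_typing (a dict) is supplied, every heap address may be
    associated with at most one element type (regularity).
    """
    if ty[0] == "int":
        return isinstance(value, int)
    if ty[0] == "nat":
        return isinstance(value, int) and value >= 0
    if ty[0] == "array":
        _, elem, size = ty
        if value not in heap or len(heap[value]) != size:
            return False
        if heap_typing is not None:
            # Reuse the recorded type for this address, or record it now.
            if heap_typing.setdefault(value, elem) != elem:
                return False
        return all(value_has_type(heap, c, elem, heap_typing)
                   for c in heap[value])
    return False

def registers_well_typed(heap, regs, reg_types, regular=False):
    """Check (H, R) |= R, optionally insisting on a regular derivation."""
    heap_typing = {} if regular else None
    return all(value_has_type(heap, regs[i], reg_types[i], heap_typing)
               for i in reg_types)

# H(h) = (0); both registers point to h with different array types.
heap = {"h": [0]}
regs = {0: "h", 1: "h"}
reg_types = {0: ("array", ("int",), 1), 1: ("array", ("nat",), 1)}
naive_ok = registers_well_typed(heap, regs, reg_types)
regular_ok = registers_well_typed(heap, regs, reg_types, regular=True)
```

The naive check accepts the aliased state, while the regular check
rejects it because h would need both element types int and nat.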

In essence, by a regular derivation of $(\mathcal{H}, \mathcal{R}) \models R$,
we mean that there is a heap typing that maps each heap
address $h \in dom(\mathcal{H})$ to a fixed type and under this typing
$\mathcal{R}(i)$ can be assigned the type $R(i)$, that is, the value in each
register has the type that is declared for the register. As a
heap typing can never be altered (but it may be extended
by the execution of newarray), we can then prove that if
$\mathcal{M} \models R[\Theta][\theta]$ has a regular derivation then
$\mathcal{M}' \models R'[\Theta'][\theta']$ also has a regular derivation,
where we use the notation in the above paragraph. The proof bears a
great deal of similarity to the soundness proof in [2].

In summary, if we start with an entailment that has a
regular derivation, then all entailments in the proof of the type
soundness of DTAL have regular derivations. Therefore, the
scenario of shared pointers mentioned previously can never
occur. This allows us to establish Theorem 5.1. Note that the
issue here, which we think is rather subtle to recognize, does
not occur in either DML or TAL. Please see [16] for details.

6. EXTENSION WITH SUM TYPES 

The programmer can declare in Xanadu a polymorphic
union type as in Figure 12 for representing lists and then
implement the length function. The concrete syntax <'a> list
is for the type of lists in which all elements are of type 'a
(we use 'a for a type variable). Note that the union types
in Xanadu correspond to datatypes in ML and the values
of union types are decomposed through pattern matching.
We informally explain the meaning of the switch statement
in Figure 12: if xs matches the pattern Nil, the value of
x is returned; if xs matches the pattern Cons(_, xs) (_ is
a wild card), then we update xs with its tail and increase
x by 1. The type following the keyword invariant states
an invariant at the program point: xs is a list of length i
and x is an integer of value j for some integers i, j satisfying
i + j = n, where n is the length of the function argument.

A union type is internally represented as a sum type. In
the case above, a tag is used to indicate whether the
outermost constructor of a list is Nil or Cons. We can compile the
length function essentially in the following manner: we
initialize x with 0 and start the following loop; given a list xs,
we perform a tag check to see whether it is Nil; if it is, we
return x; otherwise, we know that the outermost constructor
of xs must be Cons and it is unnecessary to perform another
tag check; we can simply update xs with its tail, increase x















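$$\frac{\phi, x = 0; \Delta \models \tau_0 \le \tau \quad \cdots \quad \phi, x = n - 1; \Delta \models \tau_{n-1} \le \tau}{\phi; \Delta \models choose(x, \tau_0, \ldots, \tau_{n-1}) \le \tau}\ \text{(coerce-choose-l)}$$

$$\frac{\phi; \Delta \models \tau \le \tau_i \qquad \phi \models x = i}{\phi; \Delta \models \tau \le choose(x, \tau_0, \ldots, \tau_{n-1})}\ \text{(coerce-choose-r)}$$

Figure 11: Additional type coercion rules for sum types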


('a) union list with nat =
  {Nil(0); {n:nat} Cons(n+1) of 'a * <'a>list(n)}

('a){n:nat} int(n) length (xs: <'a> list(n)) {
  invariant:
    [i:nat, j:nat | i+j=n] (xs: <'a>list(i), x: int(j))
  while (true) {
    switch (xs) {
      case Nil: return x;
      case Cons(_, xs): x = x + 1;
    }
  }
  exit; /* can never be reached */
}

Figure 12: A list length function in Xanadu


by 1 and loop again. 
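
The compilation strategy just described can be mimicked in Python as
follows. This is only a sketch under our own encoding: a list is a
pair (tag, payload) with tag 0 for Nil and 1 for Cons, and the helper
names `nil`, `cons`, `size` and `length` are hypothetical. The
assertion stands in for the invariant i + j = n, which DTAL
establishes statically rather than at run time.

```python
NIL_TAG, CONS_TAG = 0, 1

def nil():
    return (NIL_TAG, ())

def cons(head, tail):
    return (CONS_TAG, (head, tail))

def size(xs):
    """Reference length, used only to state the loop invariant."""
    n = 0
    while xs[0] == CONS_TAG:
        xs = xs[1][1]
        n += 1
    return n

def length(xs):
    n = size(xs)
    x = 0
    while True:
        assert size(xs) + x == n  # invariant: i + j = n
        if xs[0] == NIL_TAG:      # the only tag check per iteration
            return x
        # The tag is not NIL_TAG, so it must be CONS_TAG: no second check.
        xs = xs[1][1]
        x += 1

three = length(cons("a", cons("b", cons("c", nil()))))
```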

We now extend the system of DTAL to handle sum types.
In an implementation, we can use a pair on heap to represent
a sum type $sum(\tau_0, \ldots, \tau_{n-1})$, which is often written as
$\tau_0 + \cdots + \tau_{n-1}$ in the literature. The first element of the
pair is an integer $i$ such that $0 \le i < n$ and the second element
is of type $\tau_i$. We can use $choose(x, \tau_0, \ldots, \tau_{n-1})$ to
stand for a type which must be one of $\tau_0, \ldots, \tau_{n-1}$,
determined by the value of $x$: the type is $\tau_i$ if $x = i$. Also we
present some additional rules in Figure 11 for handling type coercion
involving sum types (rules for type equality are omitted).

Now we can define $sum(\tau_0, \ldots, \tau_{n-1})$ as:

$\exists a : nat_n.\ int(a) * choose(a, \tau_0, \ldots, \tau_{n-1})$,

that is, a value of type $sum(\tau_0, \ldots, \tau_{n-1})$ is represented as
a pair in which the first part is a tag determining the type
of the second part. We present an example to illustrate the
use of sum types.
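
Before that example, the role of choose can be pictured with a short
Python sketch. It is a run-time analogue only, written under our own
assumptions: summand types are modelled as membership predicates, and
`choose` and `has_sum_type` are hypothetical names rather than part of
DTAL.

```python
def choose(tag, *types):
    """choose(x, t0, ..., tn-1): the summand selected by the index x."""
    if not 0 <= tag < len(types):
        raise ValueError("tag out of range")
    return types[tag]

def has_sum_type(value, checkers):
    """value is a pair (tag, payload); checkers[i] tests the i-th summand."""
    tag, payload = value
    if not isinstance(tag, int) or not 0 <= tag < len(checkers):
        return False  # the tag must have type int(a) for some a in nat_n
    return choose(tag, *checkers)(payload)

is_unit = lambda v: v == ()
is_int = lambda v: isinstance(v, int)

ok_left = has_sum_type((0, ()), [is_unit, is_int])
ok_right = has_sum_type((1, 5), [is_unit, is_int])
bad_payload = has_sum_type((1, ()), [is_unit, is_int])
```

The tag plays the part of the existentially bound index a, and the
payload is checked against whichever summand that index selects.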

In Figure 12, we declare a dependent datatype in Xanadu
for lists; Nil is given the type <'a> list(0), that is, it is a
list of length 0; Cons is assigned the type

{n:nat} 'a * <'a> list(n) -> <'a> list(n+1),

indicating that Cons takes an element and a list of length n
and yields a list of length n + 1. This leads us to represent
the type constructor list as follows,

$\mu t.\lambda\alpha.\Pi n : nat.(\exists\phi_0.unit) + (\exists\phi_1.\alpha * (\alpha)t(a))$,

where $\mu$ is the fixed point operator, $\phi_0$ is $n = 0$ and $\phi_1$
is $a : nat, a + 1 = n$. If we unfold $(\tau)list(n)$, we obtain the
type $(\exists\phi_0.unit) + (\exists\phi_1.\tau * (\tau)list(a))$, which can be
folded into $(\tau)list(n)$. It is straightforward to apply this strategy
to a general case of dependent datatypes. We provide two
auxiliary instructions fold[$\tau$] r and unfold r to indicate the
need for folding the type of r into $\tau$ and unfolding the type
of r, respectively.

The DTAL code in Figure 13 corresponds to the Xanadu 
program in Figure 12. The state type following the label 
length indicates that the top element on the stack is a list 
and the second one is a label; the list is the argument of 
the function and the label is the return address (pushed 
onto the stack by the caller); the type of the label states 
that the top element of the stack is an integer, which is 
to be the return value of the function, and the rest of the 
stack is the same as the current stack excluding the top 
two elements. The state type following the label length
precisely indicates that this is a function that accepts a list
of length n and returns an integer of value n. We regard the
representation of dependent datatypes at assembly level as a 
significant contribution, which makes it possible to perform 
compilation with dependent types for programs in DML and 
thus certify more program properties. 

The DTAL code in Figure 13 is unsatisfactory for the
following reason. In practice, the list constructors are
usually represented without tags for reasons of both efficiency and
memory. In other words, we can interpret $(\alpha)list$ as
$\exists a : nat_2.choose(a, unit, \alpha * (\alpha)list)$. The reason is that
it can be readily tested in practice whether a value equals
() (which is commonly represented as a null pointer), and
therefore there is no need for a tag. This optimized list
representation can also be handled in DTAL. Please see [16] for
details.
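
A minimal Python sketch of the tagless representation, assuming (for
illustration only) that the empty list is `None`, playing the role of
the null pointer, and that a non-empty list is a pair (head, tail):

```python
def length(xs):
    """Length of a tagless list: None is Nil, (head, tail) is Cons."""
    x = 0
    while xs is not None:  # the null test replaces the tag check
        _, xs = xs
        x += 1
    return x

two = length(("a", ("b", None)))
```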

The treatment of sum types extends the one in [3]. There
indexed sums $\tau_1 +_i \tau_2$ ($i = 1, 2$) are introduced for types
$\tau_1$ and $\tau_2$ in addition to the standard sum $\tau_1 + \tau_2$. The
typing rules for indexed sums essentially state that for $i = 1, 2$,
$in_i(e) : \tau_1 +_i \tau_2$ is derivable if $e : \tau_i$ is, where $in_i$ is
used to indicate which rule is applied. To relate indexed sums to
sum, there are subtyping rules for making $\tau_1 +_i \tau_2$ a subtype
of $\tau_1 + \tau_2$ for $i = 1, 2$. In DTAL, $\tau_1 +_i \tau_2$ can be
interpreted as $int(i - 1) * choose(i - 1, \tau_1, \tau_2)$ and the subtyping
relation can be derived with the use of type coercion rules.


7. IMPLEMENTATION 

We have prototyped a type-checker and an interpreter for 
DTAL and verified many examples, providing a proof of 
concept. The implementation and examples are available 
on-line [14]. 

We have also prototyped a compiler which produces DTAL 
code from source programs in Xanadu, a language with C- 
like syntax in which only top level functions are supported 
and no pointers are allowed. Xanadu shares many common 
features with languages like Safe C [9] and Popcorn [6]. The 
most significant feature of Xanadu is its type system, which 
supports a restricted form of dependent types that are
similar to those in DTAL, though registers are replaced with






length: ('r, 'a){n:nat} [sp: 'a list(n) :: [sp: int(n) :: 'r] :: 'r]
        // [sp: int(n) :: 'r] represents the state type of the return
        // address (label) which is pushed on the stack by the caller.
        // Note that 'a list is represented as a dependent type internally
        pop  r1            // pop the list argument into r1
        mov  r2, 0         // initialize r2

loop:   ('r, 'a){i:nat, j:nat | i+j=n}
        [r1: 'a list(i), r2: int(j), sp: [sp: int(n) :: 'r] :: 'r]
        unfold r1          // unfold the list type of r1
        load r3, r1(0)     // load list tag into r3 (r3 = 0 or 1)
        beq  r3, finish    // goto finish if r1 is empty (r3 = 0)
        load r1, r1(1)     // r1: 'a * 'a list(i-1) (r3 = 1 since r3 is not 0)
        load r1, r1(1)     // move list tail into r1
        add  r2, r2, 1     // r2: int(j+1)
        jmp  loop          // loop again

finish: ('r){n:nat} [r2: int(n), sp: [sp: int(n) :: 'r] :: 'r]
        pop  r1            // return address pops into r1
        push r2            // result pushes onto the stack
        jmp  r1            // return


Figure 13: An implementation of the length function on lists in DTAL 


local variables in a program. Please see [15] for more details.

The compilation is like compiling C into a typical untyped 
assembly language except that here we need to construct 
state types for labels. We have compiled all the examples in 
this paper. 2 

In Xanadu, we allow the programmer to provide loop
invariants in the form of dependent types so that significantly
more array bound checks can be eliminated in practice. In
Figure 14, the top part is a program in Xanadu, which
initializes an array with zeros, and the rest is the DTAL code
compiled from the program. The function header:

{n:nat} unit initialize(int vec[n])

indicates that for every natural number n, initialize takes
an integer array of size n and returns no value. The type
following the keyword invariant essentially states that i
and l are of types int(a) and int(b), respectively, where a
and b are natural numbers satisfying a + b = n. Note that
n is the size of array vec.
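
To see why this invariant makes the bound check on vec[i] redundant,
consider the following Python sketch of the same loop. The function
mirrors the Xanadu program only informally, and the assertions stand
in for facts that the type-checker would establish statically.

```python
def initialize(vec):
    """Zero an array using the loop of Figure 14 (informal mirror)."""
    n = len(vec)
    i, l = 0, n
    while l > 0:
        # Invariant: i = a, l = b with a : nat, b : nat and a + b = n.
        assert i >= 0 and l >= 0 and i + l == n
        # From a + b = n and b > 0 we get 0 <= a < n, so the access is safe.
        assert 0 <= i < n
        vec[i] = 0
        i, l = i + 1, l - 1
    return vec

zeros = initialize([5, 6, 7])
```

Since l > 0 holds inside the loop body, i = n - l < n follows from the
invariant, which is exactly the constraint needed for the store.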

The Xanadu program can be compiled into the DTAL
code excluding the state types for labels in a standard
manner. This part is exactly like compiling a corresponding C
program. We briefly mention the construction of the state
types in Figure 14. Notice that the state type attached
to loop is essentially translated from the type annotation in
the source program. We simply modify the annotation to
include the types of variables not mentioned and then replace
the variables with the registers to which these variables are
mapped. We expect to formalize such a compilation
strategy in future and show that a well-typed Xanadu program
can always be thus compiled into well-typed DTAL code.
At present, we may merely view the type annotations in
Xanadu as compilation hints for generating well-typed DTAL
code.

2 We currently do not have a pretty printer for the generated 
DTAL code, and therefore we took the liberty to prettify the 
DTAL code presented in this paper. 


8. RELATED WORK 

There is a great deal of ongoing research on certifying
compilers. Examples of certifying compilers for type and
memory safety include various ones compiling Java into Java
virtual machine language (JVML), Touchstone compiling Safe
C into a form of proof-carrying code (which we call TPCC)
[9], TIL [11] and its successor TILT and FLINT/ML [10]
compiling SML [5] into a typed intermediate language [11],
and ROML [12] compiling a restricted set of ML into a
portion of C that is type safe.

DTAL is an extension of TAL with dependent types, and
it can be readily transformed into a TAL-like language if
one erases all syntax related to type index expressions. In
this respect, DTAL generalizes TAL. In DTAL,
initialization is treated differently from in TAL. A type in TAL can
be annotated with a flag to indicate the initialization
status of a value with this type, but the type top is used in
DTAL to represent the type of all uninitialized values. This
strategy works because every array (and tuple if present)
is initialized upon allocation in DTAL.

The notion of proof-carrying code introduced in [8] can
address the memory safety issue in mobile code as follows.
The essential idea is to generate a proof asserting the
memory safety property of code and then attach it to the code.
The proof carried by the code can then be verified before
execution. This is an attractive approach but a challenging
question remains, that is, how to generate a proof to
assert the memory safety property of a (large and complex)
program. The Touchstone compiler [9], which compiles
programs written in a type-safe subset of C into proof-carrying
code (TPCC for Touchstone's PCC), handles this question
through a general verification condition generator [1],
generating verification conditions for both type safety and
memory safety. Also TPCC performs some loop invariant
synthesis for eliminating array bound checks. In general, TPCC
seems more involved in handling type safety when compared
to TAL, while TAL seems less flexible than TPCC.

DML is a functional programming language that enriches
ML with a restricted form of dependent types [18],
allowing the programmer to capture more program invariants
through types and thus to detect more program errors at
compile-time. In particular, the programmer can capture
more invariants in data structures by refining datatypes
with type index expressions. For instance, one can form
a datatype in DML that is precisely for all red/black trees
and program with such a type. The type system of DML is
also studied for array bound check elimination [17].

{n:nat} unit initialize(int vec[n]) {
  i = 0; l = arraysize(vec);
  invariant: [a:nat, b:nat | a + b = n] (i: int(a), l: int(b))
  while (l > 0) { vec[i] = 0; i = i + 1; l = l - 1; }
}

init:   ('r) {n:nat} [sp: int array(n) :: [sp: 'r] :: 'r]
        pop r1
        mov r2, 0
        arraysize r3, r1

loop:   ('r) {n:nat, a:nat, b:nat | a + b = n}
        [r1: int array(n), r2: int(a), r3: int(b), sp: [sp: 'r] :: 'r]
        blte r3, finish
        store r1(r2), 0
        add r2, r2, 1
        sub r3, r3, 1
        jmp loop

finish: ('r) [sp: [sp: 'r] :: 'r]

Figure 14: Implementations of an initialization function in Xanadu and DTAL

DTAL stands as an alternative design choice to TPCC,
extending TAL with a form of dependent types that is largely
adopted from DML. The design of DTAL is partly motivated
by an attempt to build a certifying compiler for DML.
Unlike TPCC, there are no proofs attached to DTAL code. The
verifier for DTAL code is a dependent type-checker
consisting of a constraint generator and a constraint solver. In
general, proof verification is easier than proof search, and
therefore the TPCC startup overhead should be less than
that for DTAL code, though it seems too difficult at this
stage to perform a meaningful comparison. In future, we
are also interested in constructing a proof asserting the
well-typedness of DTAL code and thus providing a means of
generating a form of proof-carrying code from programs in
Xanadu. This is appealing as Xanadu allows the
programmer to formally supply program invariants that may be too
sophisticated to synthesize and thus facilitates the
construction of proof-carrying code.

We view DTAL as a type-theoretic approach to reasoning 
about memory safety at assembly level. With a stronger 
type system than that of TAL, DTAL is expected to capture 
program errors that can slip through the type system of 
TAL. This is supported by the fact that DML can capture
program errors in practice that elude the type system of
ML.


9. CONCLUSION 

TAL is a typed assembly language with a type system at
assembly level. The type system of TAL contains some
limitations that prevent certain important loop-based
optimizations such as array bound check elimination and tag check
elimination. We have enriched TAL with a restricted form of
dependent types and the enrichment leads to a dependently
typed assembly language (DTAL) that overcomes these
limitations. We have established the soundness of the type
system of DTAL and implemented a type-checking
algorithm. We have also constructed a prototype compiler which
compiles Xanadu programs into DTAL, where Xanadu is a
programming language with C-like syntax that supports a
dependent type system similar to that of DTAL but
significantly more involved.

In future work, we intend to study compilation with
dependent types, translating programs in DML into DTAL.
We feel that the presented approach to representing
dependent datatypes in DTAL has made a significant step towards
achieving this goal. On a larger scale, we are interested
in both using types to capture more program properties in
high-level languages and constructing certifying compilers
to translate these properties into low-level languages.

10. ACKNOWLEDGMENT 

We thank the anonymous referees for their detailed
constructive comments, which have undoubtedly raised the
quality of the paper.

11. REFERENCES 

[1] R. W. Floyd. Assigning meanings to programs. In 
J. T. Schwartz, editor, Mathematical Aspects of 
Computer Science, volume 19 of Proceedings of 
Symposia in Applied Mathematics, pages 19-32, 
Providence, Rhode Island, 1967. American 
Mathematical Society. 

[2] R. Harper. A simplified account of polymorphic 
references. Information Processing Letters, 

51:201-206, 1994. 

[3] R. Harper and C. Stone. A type-theoretic 
interpretation of Standard ML. In G. Plotkin, 

C. Stirling, and M. Tofte, editors, Robin Milner 
Festschrifft. MIT Press, 1998. (To appear). 

[4] P. Martin-Löf. Intuitionistic Type Theory. Bibliopolis, 
Naples, Italy, 1984. 

[5] R. Milner, M. Tofte, R. W. Harper, and 

D. MacQueen. The Definition of Standard ML. MIT 
Press, Cambridge, Massachusetts, 1997. 

[6] G. Morrisett et al. Talx86: A realistic typed assembly 
language. In Proceedings of Workshop on Compiler 
Support for System Software, 1999. 

[7] G. Morrisett, D. Walker, K. Crary, and N. Glew. From 
system F to typed assembly language. In Proceedings 
of ACM Symposium on Principles of Programming 
Languages, pages 85-97, January 1998. 

[8] G. Necula. Proof-carrying code. In Conference Record 
of 24th Annual ACM Symposium on Principles of 
Programming Languages, pages 106-119. ACM press, 
1997. 

[9] G. Necula and P. Lee. The design and implementation 
of a certifying compiler. In ACM SIGPLAN ’98 
Conference on Programming Language Design and 
Implementation, pages 333-344. ACM press, June 

1998. 

[10] Z. Shao. An Overview of the FLINT/ML compiler. In 
Proceedings of ACM SIGPLAN Workshop on Types in 
Compilation (TIC ’97), June 1997. 


[11] D. Tarditi, G. Morrisett, P. Cheng, C. Stone, 

R. Harper, and P. Lee. A type-directed optimizing 
compiler for ML. In Proceedings of ACM SIGPLAN 
Conference on Programming Language Design and 
Implementation, pages 181-192, June 1996. 

[12] A. Tolmach and D. P. Oliva. From ML to Ada(!?!): 
Strongly-typed language interoperability via source 
translation. Journal of Functional Programming, 

8(4):367-412, July 1998. 

[13] H. Xi. Dependent Types in Practical Programming. 
PhD thesis, Carnegie Mellon University, 1998. pp. 
viii+189. Available as 
http://www.cs.cmu.edu/~hwxi/DML/thesis.ps. 

[14] H. Xi. Implementations and Examples for Xanadu and 
DTAL. Available at 
http://www.ececs.uc.edu/~hwxi/Xanadu-DTAL, 1999. 

[15] H. Xi. Imperative Programming with Dependent 
Types. In Proceedings of 15th IEEE Symposium on 
Logic in Computer Science, pages 375-387, June 2000. 

[16] H. Xi and R. Harper. A Dependently Typed Assembly 
Language. Technical Report CSE-99-008, 

Oregon Graduate Institute, July 1999. Also available as 

http://www.ececs.uc.edu/~hwxi/academic/papers/DTAL.ps. 

[17] H. Xi and F. Pfenning. Eliminating array bound 
checking through dependent types. In Proceedings of 
ACM SIGPLAN Conference on Programming 
Language Design and Implementation, pages 249-257, 
Montreal, June 1998. 

[18] H. Xi and F. Pfenning. Dependent types in practical 
programming. In Proceedings of ACM SIGPLAN 
Symposium on Principles of Programming Languages, 
pages 214-227, San Antonio, January 1999. 



